Smart Paper Notes

Insights into the origin of halo mass profiles from machine learning

Luisa Lucie-Smith , Susmita Adhikari , Risa H. Wechsler

Categories: Artificial Intelligence | Machine Learning

2022-05-09

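The mass distribution of dark matter haloes is the result of the hierarchical growth of initial density perturbations through mass accretion and mergers. We use an interpretable machine learning framework to provide physical insights into the origin of the spherically averaged mass profile of dark matter haloes. We train a gradient-boosted-trees algorithm to predict the final mass profiles of cluster-sized haloes, and measure the importance of the different inputs provided to the algorithm. We find two primary scales in the initial conditions (ICs) that impact the final mass profile: the density at approximately the scale of the haloes' Lagrangian patch $r_l$ ($r \sim 0.7\, r_l$), and the density in the large-scale environment ($r \sim 1.7\, r_l$). The model also identifies three primary time-scales in the halo assembly history that affect the final profile: (i) the formation time of the virialized, collapsed material inside the halo; (ii) the dynamical time, which captures the dynamically unrelaxed, infalling component of the halo over its first orbit; and (iii) a third, most recent time-scale, which captures the impact of recent massive merger events on the outer profile. While the inner profile retains memory of the ICs, this information alone is insufficient to yield accurate predictions for the outer profile. When we add information about the haloes' mass accretion history, we find a significant improvement in the predicted profiles at all radii. Our machine learning framework provides novel insights into the role of the ICs and the mass assembly history in determining the final mass profile of cluster-sized haloes.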

Translated by Google Translate

TruFor: Leveraging all-round clues for trustworthy image forgery detection and localization

Fabrizio Guillaro , Davide Cozzolino , Avneesh Sud , Nicholas Dufour , Luisa Verdoliva

Category: Computer Vision

2022-12-21

In this paper we present TruFor, a forensic framework that can be applied to a large variety of image manipulation methods, from classic cheapfakes to more recent manipulations based on deep learning. We rely on the extraction of both high-level and low-level traces through a transformer-based fusion architecture that combines the RGB image and a learned noise-sensitive fingerprint. The latter learns to embed the artifacts related to the camera's internal and external processing by training only on real data in a self-supervised manner. Forgeries are detected as deviations from the expected regular pattern that characterizes each pristine image. Looking for anomalies makes the approach able to robustly detect a variety of local manipulations, ensuring generalization. In addition to a pixel-level localization map and a whole-image integrity score, our approach outputs a reliability map that highlights areas where localization predictions may be error-prone. This is particularly important in forensic applications in order to reduce false alarms and allow for large-scale analysis. Extensive experiments on several datasets show that our method is able to reliably detect and localize both cheapfake and deepfake manipulations, outperforming state-of-the-art works. Code will be publicly available at https://grip-unina.github.io/TruFor/

LiFe-net: Data-driven Modelling of Time-dependent Temperatures and Charging Statistics Of Tesla's LiFePo4 EV Battery

Jeyhun Rustamov , Luisa Fennert , Nico Hoffmann

Category: Machine Learning

2022-12-16

Modelling the temperature of Electric Vehicle (EV) batteries is a fundamental task of EV manufacturing. Extreme temperatures in the battery packs can affect their longevity and power output. Although theoretical models exist for describing heat transfer in battery packs, they are computationally expensive to simulate. Furthermore, it is difficult to acquire data measurements from within the battery cell. In this work, we propose a data-driven surrogate model (LiFe-net) that uses readily accessible driving diagnostics for battery temperature estimation to overcome these limitations. This model incorporates Neural Operators with a traditional numerical integration scheme to estimate the temperature evolution. Moreover, we propose two further variations of the baseline model: LiFe-net trained with a regulariser and LiFe-net trained with time stability loss. We compared these models in terms of generalization error on test data. The results showed that LiFe-net trained with time stability loss outperforms the other two models and can estimate the temperature evolution on unseen data with a relative error of 2.77 % on average.
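
The abstract above mentions pairing a learned model with a traditional numerical integration scheme and reporting a relative error. The following is only a minimal sketch of that general idea, not the authors' implementation: forward Euler is assumed as the integration scheme, `toy_rate` is a hypothetical stand-in for a trained network, and the relative-error definition (mean of |prediction - truth| / |truth|) is an assumption, since the abstract does not specify one.

```python
from typing import Callable, List, Sequence


def integrate_temperature(
    rate_fn: Callable[[float, float], float],
    t0: float,
    inputs: Sequence[float],
    dt: float,
) -> List[float]:
    """Forward Euler: T[k+1] = T[k] + dt * rate_fn(T[k], u[k]).

    rate_fn stands in for a learned model of dT/dt (hypothetical here).
    """
    temps = [t0]
    for u in inputs:
        temps.append(temps[-1] + dt * rate_fn(temps[-1], u))
    return temps


def mean_relative_error(pred: Sequence[float], true: Sequence[float]) -> float:
    """Mean of |pred - true| / |true| (assumed definition of relative error)."""
    return sum(abs(p - t) / abs(t) for p, t in zip(pred, true)) / len(true)


def toy_rate(temp: float, current: float) -> float:
    """Hypothetical rate: heating grows with current squared, cooling pulls toward 25."""
    return 0.01 * current ** 2 - 0.1 * (temp - 25.0)


if __name__ == "__main__":
    # Three time steps of a made-up driving-current signal.
    trajectory = integrate_temperature(toy_rate, t0=25.0, inputs=[10.0, 10.0, 0.0], dt=1.0)
    print(trajectory)
```

In practice the rate function would be replaced by the trained surrogate, and the error would be computed against measured temperatures on held-out drives.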

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Teven Le Scao , Angela Fan , Christopher Akiki , Ellie Pavlick , Suzana Ilić , Daniel Hesslow , Roman Castagné , Alexandra Sasha Luccioni , François Yvon , Matthias Gallé

Category: Natural Language Processing

2022-11-09

Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.

antGLasso: An Efficient Tensor Graphical Lasso Algorithm

Bailey Andrew , David Westhead , Luisa Cutillo

Categories: Machine Learning (Statistics) | Machine Learning

2022-11-05

The class of bigraphical lasso algorithms (and, more broadly, 'tensor'-graphical lasso algorithms) has been used to estimate dependency structures within matrix and tensor data. However, all current methods to do so take prohibitively long on modestly sized datasets. We present a novel tensor-graphical lasso algorithm that analytically estimates the dependency structure, unlike its iterative predecessors. This provides a speedup of multiple orders of magnitude, allowing this class of algorithms to be used on large, real-world datasets.

Deepfake audio detection by speaker verification

Alessandro Pianese , Davide Cozzolino , Giovanni Poggi , Luisa Verdoliva

Category: Computer Vision

2022-09-28

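Thanks to recent advances in deep learning, sophisticated generation tools now exist that produce extremely realistic synthetic speech. However, malicious use of such tools is possible and poses a serious threat to our society. Synthetic voice detection has therefore become a pressing research topic, and a large variety of detection methods have recently been proposed. Unfortunately, they hardly generalize to synthetic audio generated by tools never seen in the training phase, which makes them unfit for real-world scenarios. In this work, we aim to overcome this issue by proposing a new detection approach that leverages only the biometric characteristics of the speaker, with no reference to specific manipulations. Since the detector is trained only on real data, generalization is automatically ensured. The proposed approach can be implemented on top of off-the-shelf speaker verification tools. We test several such solutions on three popular test sets, obtaining good performance, high generalization ability, and high robustness.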
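
The detection idea described above, checking whether a clip matches the biometric profile of the claimed speaker, can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's method: speaker embeddings are assumed to come from some off-the-shelf verification model (not shown), and the cosine-similarity scoring, the use of the best-matching reference, and the `threshold` value are all hypothetical choices.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def is_likely_synthetic(
    test_embedding: Sequence[float],
    reference_embeddings: Sequence[Sequence[float]],
    threshold: float = 0.5,  # hypothetical value; would be tuned on real data
) -> bool:
    """Flag a clip when it does not match any real reference of the claimed speaker.

    reference_embeddings are assumed to come from genuine recordings only.
    """
    best = max(cosine_similarity(test_embedding, ref) for ref in reference_embeddings)
    return best < threshold


if __name__ == "__main__":
    refs = [[1.0, 0.0], [0.9, 0.1]]  # made-up two-dimensional embeddings
    print(is_likely_synthetic([1.0, 0.0], refs))
    print(is_likely_synthetic([0.0, 1.0], refs))
```

Because only genuine recordings are needed to build the reference set, a detector of this shape does not depend on examples from any particular synthesis tool.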

CometKiwi: IST-Unbabel 2022 Submission for the Quality Estimation Shared Task

Ricardo Rei , Marcos Treviso , Nuno M. Guerreiro , Chrysoula Zerva , Ana C. Farinha , Christine Maroti , José G. C. de Souza , Taisiya Glushkova , Duarte M. Alves , Alon Lavie

Categories: Natural Language Processing | Machine Learning

2022-09-13

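We present the joint contribution of IST and Unbabel to the WMT 2022 Shared Task on Quality Estimation (QE). Our team participated in all three subtasks: (i) sentence-level and word-level quality prediction; (ii) explainable QE; and (iii) critical error detection. For all tasks, we build on top of the COMET framework, connecting it with the predictor-estimator architecture of OpenKiwi and equipping it with a word-level sequence tagger and an explanation extractor. Our results suggest that incorporating references during pretraining improves performance across several language pairs on downstream tasks, and that jointly training with sentence-level and word-level objectives yields a further boost. Furthermore, combining attention and gradient information proved to be the top strategy for extracting good explanations from sentence-level QE models. Overall, our submissions achieved the best results for all three tasks for almost all language pairs.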

Towards a Sentiment-Aware Conversational Agent

Isabel Dias , Ricardo Rei , Patrícia Pereira , Luisa Coheur

Category: Natural Language Processing

2022-07-24

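In this paper, we propose an end-to-end sentiment-aware conversational agent based on two models: a reply sentiment prediction model, which leverages the context of the dialogue to predict an appropriate sentiment for the agent to express in its reply; and a text generation model, which is conditioned on the predicted sentiment and the context of the dialogue to produce a reply that is appropriate to both the context and the sentiment. Additionally, we propose using a sentiment classification model to evaluate the sentiment expressed by the agent during model development. This allows us to evaluate the agent automatically. Both automatic and human evaluation results show that explicitly guiding the text generation model with a predefined set of sentences leads to clear improvements in both the expressed sentiment and the quality of the generated text.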

Generalized Beliefs for Cooperative AI

Darius Muglich , Luisa Zintgraf , Christian Schroeder de Witt , Shimon Whiteson , Jakob Foerster

Categories: Artificial Intelligence | Machine Learning

2022-06-26

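Self-play is a common paradigm for constructing solutions in Markov games, and it can yield optimal policies in collaborative settings. However, these policies often adopt highly specialized conventions that make playing with a novel partner difficult. To address this, recent approaches rely on encoding symmetry and convention awareness into policy training, but these require strong environmental assumptions and can complicate policy training. We therefore propose moving the learning of conventions to the belief space. Specifically, we propose a belief learning model that can maintain beliefs over rollouts of policies not seen at training time, and can thus decode and adapt to novel conventions at test time. We show how to leverage this model for both search and training of a best response over various pools of policies to greatly improve ad hoc teamplay. We also show how our setup promotes the explainability and interpretability of nuanced agent conventions.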

Building an Endangered Language Resource in the Classroom: Universal Dependencies for Kakataibo

Roberto Zariquiey , Claudia Alvarado , Ximena Echevarria , Luisa Gomez , Rosa Gonzales , Mariana Illescas , Sabina Oporto , Frederic Blum , Arturo Oncevay , Javier Vera

Category: Natural Language Processing

2022-06-21

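In this paper, we launch a new Universal Dependencies treebank for an endangered language from Amazonia: Kakataibo, a Panoan language spoken in Peru. We first discuss the collaborative methodology we implemented, which proved effective for creating a treebank in the context of an undergraduate computational linguistics course. We then describe the general details of the treebank and the language-specific considerations implemented for the proposed annotation. Finally, we conduct some experiments on part-of-speech tagging and syntactic dependency parsing. We focus on monolingual and transfer learning settings, where we study the impact of a Shipibo-Konibo treebank, a resource for another Panoan language.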
